Large-Scale Statistical Machine Translation with Weighted Finite State Transducers
Abstract
The Cambridge University Engineering Department (CUED) phrase-based statistical machine translation system follows a generative model of translation. It is implemented by composing component models of translation and movement, each realised as a Weighted Finite State Transducer. Our flexible architecture requires no special-purpose decoder and readily handles the large-scale natural language processing demands of state-of-the-art machine translation systems. In this paper we describe the CUED system’s participation in the NIST 2008 Arabic-English machine translation evaluation task.
Similar resources
Phrasal Segmentation Models for Statistical Machine Translation
Phrasal segmentation models define a mapping from the words of a sentence to sequences of translatable phrases. We discuss the estimation of these models from large quantities of monolingual training text and describe their realization as weighted finite state transducers for incorporation into phrase-based statistical machine translation systems. Results are reported on the NIST Arabic-English...
European Language Translation with Weighted Finite State Transducers: The CUED MT System for the 2008 ACL Workshop on SMT (ACL 2008 Third Workshop on Statistical Machine Translation, http://www.statmt.org)
We describe the Cambridge University Engineering Department phrase-based statistical machine translation system for Spanish-English and French-English translation in the ACL 2008 Third Workshop on Statistical Machine Translation Shared Task. The CUED system follows a generative model of translation and is implemented by composition of component models realised as Weighted Finite State Transducer...
A phrase-level machine translation approach for disfluency detection using weighted finite state transducers
We propose a novel algorithm to detect disfluency in speech by reformulating the problem as phrase-level statistical machine translation using weighted finite state transducers. We approach the task as translation of noisy speech to clean speech. We simplify our translation framework such that it does not require fertility and alignment models. We tested our model on the Switchboard disfluency-...
Efficient Path Counting Transducers for Minimum Bayes-Risk Decoding of Statistical Machine Translation Lattices
This paper presents an efficient implementation of linearised lattice minimum Bayes-risk decoding using weighted finite state transducers. We introduce transducers to efficiently count lattice paths containing n-grams and use these to gather the required statistics. We show that these procedures can be implemented exactly through simple transformations of word sequences to sequences of n-grams....
A Weighted Finite State Transducer Translation Template Model for Statistical Machine Translation (CLSP Research Note No. 48)
We present a Weighted Finite State Transducer Translation Template Model for statistical machine translation. This is a source-channel model of translation inspired by the Alignment Template translation model. The model attempts to overcome the deficiencies of word-to-word translation models by considering phrases rather than words as units of translation. The approach we describe allows us to i...